feat: Libraries v2 backup endpoint by rodmgwgu · Pull Request #37419 · openedx/openedx-platform

rodmgwgu · 2025-10-01T22:22:55Z

Description

Adds endpoints for Library v2 backup functionality, related to the new library import/export functionality being implemented in openedx-learning by @dwong2708.

Use Case

Start an asynchronous task to back up the content of a library to a .zip file
Get a status on an asynchronous export task, including the download url

Example Requests

POST /api/libraries/v2/{library_id}/backup/
GET /api/libraries/v2/{library_id}/backup/?task_id={task_id}

POST Response Values

If the import task is started successfully, an HTTP 200 "OK" response is returned.

The HTTP 200 response has the following values:

task_id: UUID of the created task, usable for checking status

Example POST Response

{
    "task_id": "7069b95b-ccea-4214-b6db-e00f27065bf7"
}

GET Parameters

A GET request can include the following parameters:

task_id: (required) The UUID of the task to check.

GET Response Values

If the import task is found successfully by the UUID provided, an HTTP 200 "OK" response is returned.

The HTTP 200 response has the following values:

state: String description of the state of the task
url: (may be null) If the task is complete, a URL to download the .zip file

Example GET Response

{
    "state": "Succeeded",
    "url": "/media/user_tasks/2025/10/03/lib-wgu-csprob-2025-10-03-153633.zip"
}

Supporting information

Resolves: openedx/openedx-core#387
Required by: openedx/frontend-app-authoring#2448

Testing instructions

Create a new library from the studio interface (from the "Libraries Beta" tab, not "~~Legacy Libraries~~")
note the Library key from the URL when looking at the library details (something like lib:WGU:CSPROB)
do a POST to /api/libraries/v2/lib:WGU:CSPROB/backup/ (change the library key with yours)
You should get a response like {"task_id":"747e5ab3-5aef-4aef-88a3-6a195c849e12"}
do a GET to /api/libraries/v2/lib:WGU:CSPROB/backup/?task_id=747e5ab3-5aef-4aef-88a3-6a195c849e12 (change the library key and the task_id to yours)
You should get a response like:

{
    "state": "Succeeded",
    "url": "/media/user_tasks/2025/10/03/lib-wgu-csprob-2025-10-03-153633.zip"
}

Deadline

Ulmo Release

Other information

Depends on changes in openedx-learning, which are partially merged as of Oct 3 2025. This means the export works, but the contents of the generated zip file may be incomplete.

openedx-webhooks · 2025-10-01T22:23:01Z

Thanks for the pull request, @rodmgwgu!

This repository is currently maintained by @openedx/wg-maintenance-edx-platform.

Once you've gone through the following steps feel free to tag them in a comment and let them know that your changes are ready for engineering review.

🔘 Get product approval

If you haven't already, check this list to see if your contribution needs to go through the product review process.

If it does, you'll need to submit a product proposal for your contribution, and have it reviewed by the Product Working Group.
- This process (including the steps you'll need to take) is documented here.
If it doesn't, simply proceed with the next step.

🔘 Provide context

To help your reviewers and other members of the community understand the purpose and larger context of your changes, feel free to add as much of the following information to the PR description as you can:

Dependencies

This PR must be merged before / after / at the same time as ...
Blockers

This PR is waiting for OEP-1234 to be accepted.
Timeline information

This PR must be merged by XX date because ...
Partner information

This is for a course on edx.org.
Supporting documentation
Relevant Open edX discussion forum threads

🔘 Get a green build

If one or more checks are failing, continue working on your changes until this is no longer the case and your build turns green.

Details

Where can I find more information?

If you'd like to get more details on all aspects of the review process for open source pull requests (OSPRs), check out the following resources:

When can I expect my changes to be merged?

Our goal is to get community contributions seen and reviewed as efficiently as possible.

However, the amount of time that it takes to review and merge a PR can vary significantly based on factors such as:

The size and impact of the changes that it introduces
The need for product review
Maintenance status of the parent repository

💡 As a result it may take up to several weeks or months to complete a review and merge your PR.

dwong2708 · 2025-10-02T17:28:04Z

Could you please include the github issue in your description? Something like "Resolves: openedx/openedx-core#387" at the top.

ormsbee

A few relatively minor questions and requests.

I'm a little uncomfortable with the amount of mocking necessary in the tests, and I do wish we could do a bit more in the way of end-to-end testing (e.g. creating a small library and exporting it). But that's probably not the best use of time at the moment.

Thank you!

ormsbee · 2025-10-03T20:10:37Z

+        - Failed: Task failed and the export did not complete.
+    """
+    ensure_cms("backup_library may only be executed in a CMS context")
+    set_code_owner_attribute_from_module(__name__)


This line shouldn't be necessary.

I see that most of the tasks on this module have the @set_code_owner_attribute decorator, which is not compatible with the @shared_task decorator when using UserTask, that's why I'm manually setting it here, as it's done also on the existing course export/import tasks.

Although I'm not sure I understand 100% why it's needed there, are you sure this is not needed here?

Setting the code owner is something we did at edX so that if the APM error monitoring started throwing a bunch of errors on the endpoint, the person on call would know who to contact to help triage the issue. It also makes it easier for teams to build dashboards around it. I guess there's no harm in keeping it, but I don't think anyone will be looking at the data that it generates.

@robrap, @feanil: Am I remembering that correctly?

Thanks. This decorator is no longer needed. See:

[DEPR]: Code Owner Monitoring edx-django-utils#469

Thanks!, I'll remove it

After removing set_code_owner_attribute_from_module, the "Semgrep code quality" check is now failing with:

test_root.semgrep.celery-missing-code-owner-function
Celery tasks need to be decorated with
@set_code_owner_attribute (from the
edx_django_utils.monitoring module) in order for us to
correctly track code-owners for errors and in other
monitoring.

For more information, see the Celery section of "Add Code_Owner Custom Attributes to an IDA" in the Monitoring How-Tos of <https://docs.openedx.org/projects/edx-django- utils/en/latest/>.

It seems there is a test specific for this still up

Apologies. You can restore it, or remove the test. It’s a DEPR that requires attention at some point.

thanks, restored it for now.

ormsbee · 2025-10-03T20:19:29Z

+
+        with open(file_path, 'rb') as zipfile:
+            artifact = UserTaskArtifact(status=self.status, name='Output')
+            artifact.file.save(name=os.path.basename(zipfile.name), content=File(zipfile))


Will this generated file have a difficult-to-guess URL (e.g. a randomized value for the directory prefix to the file)? My understanding is that UserTaskArtifacts are not secured via auth by default, and that we have to rely on making difficult-to-brute-force URLs.

For the default filesystem store, this is correct, it's not secured by default, however other stores may implement security themselves, for example, the S3 store does have this and generates a temporary signed url for the file.

In my tests, the files generates using the filesystem storage look like "/media/user_tasks/2025/10/03/lib-wgu-csprob-2025-10-03-153633.zip", the only thing that would make them kind of difficult to guess are the datetime in the file name.

I could add a random postfix to the filename, I'm not sure if I can control the directory as it's managed by the UserTasks library.

If the S3 version has a temp signed url for the file, I think that's acceptable. This is one of the things we'll want to include in the release notes.

Here is more info on how this behaves, from the django-user-tasks docs: https://django-user-tasks.readthedocs.io/en/latest/authorization.html#artifact-url-access

ormsbee · 2025-10-04T05:12:35Z

+        # Get the latest task for this user and library
+        task_status = UserTaskStatus.objects.filter(
+            user_id=user_id,
+            name__contains=str(library_key)


So at a low level, I think we'd want to filter by task_class (so there's no mixing of backup and restore tasks), and put a bound on created (automated processes may make many, many tasks over time).

But at a higher level, is this part of the API here to service the use case where someone wanders away from the backup page and then comes back to it? Because if so, I'd rather we just not handle that case yet, and require the user to stay on the page. It shouldn't take that long to create the backup, and it's not clear to me how long a backup should remain "good", or what exactly the logic for invalidating a backup download should be, if we allow people to access historical backups.

ok, I removed that use case, now we always need to specify the task_id. thanks!

ormsbee

Very small request, and then I think we're good to squash and merge. Thank you!

ormsbee · 2025-10-06T21:27:04Z

+    task_status = None
+    if task_id:
+        # Get the specific task by ID
+        try:
+            task_status = UserTaskStatus.objects.get(task_id=task_id, user_id=user_id)
+        except UserTaskStatus.DoesNotExist:
+            return None
+
+    if not task_status:
+        return None


Since task_id is now required, we don't need all this any longer. We can do something more like:

Suggested change

task_status = None

if task_id:

# Get the specific task by ID

try:

task_status = UserTaskStatus.objects.get(task_id=task_id, user_id=user_id)

except UserTaskStatus.DoesNotExist:

return None

if not task_status:

return None

try:

task_status = UserTaskStatus.objects.get(task_id=task_id, user_id=user_id)

except UserTaskStatus.DoesNotExist:

return None

bradenmacdonald · 2025-10-17T21:13:43Z

+
+    if task_status.state == UserTaskStatus.SUCCEEDED:
+        artifact = UserTaskArtifact.objects.get(status=task_status, name='Output')
+        result['url'] = artifact.file.storage.url(artifact.file.name)


@rodmgwgu CC @holaontiveros Do you think we can update this to return an absolute URL, before Ulmo? REST APIs should generally only return absolute URLs, doubly so if the field is called "url", and triply so if we're talking about file storage which may be on a totally separate server/domain like S3. But this API is currently returning just the path part.

I noticed this here in the corresponding frontend PR.

@bradenmacdonald I modeled this one around the existing course export code here, which is doing the same, it seems that if it's S3, it will give us an absolute url, but for filestorage it returns a relative path.

But what you mention makes sense, I'll review this to always return an absoulte URL.

@bradenmacdonald here is the PR addressing this issue: #37508

openedx-webhooks added the open-source-contribution PR author is not from Axim or 2U label Oct 1, 2025

openedx-webhooks added this to Contributions Oct 1, 2025

github-project-automation Bot moved this to Needs Triage in Contributions Oct 1, 2025

rodmgwgu commented Oct 1, 2025

View reviewed changes

Comment thread openedx/core/djangoapps/content_libraries/api/libraries.py Outdated

rodmgwgu commented Oct 1, 2025

View reviewed changes

Comment thread openedx/core/djangoapps/content_libraries/rest_api/libraries.py

rodmgwgu force-pushed the feat/librarybackupapi branch from ab2af32 to b9d8748 Compare October 2, 2025 15:32

ormsbee reviewed Oct 2, 2025

View reviewed changes

Comment thread openedx/core/djangoapps/content_libraries/tasks.py Outdated

ormsbee reviewed Oct 2, 2025

View reviewed changes

Comment thread openedx/core/djangoapps/content_libraries/tasks.py Outdated

mphilbrick211 added the mao-onboarding Reviewing this will help onboard devs from an Axim mission-aligned organization (MAO). label Oct 2, 2025

mphilbrick211 moved this from Needs Triage to Waiting on Author in Contributions Oct 2, 2025

rodmgwgu force-pushed the feat/librarybackupapi branch 3 times, most recently from a9e1d2e to 2333d13 Compare October 3, 2025 18:08

rodmgwgu marked this pull request as ready for review October 3, 2025 18:48

rodmgwgu requested a review from ormsbee October 3, 2025 18:49

ormsbee moved this from Waiting on Author to In Eng Review in Contributions Oct 3, 2025

ormsbee requested changes Oct 3, 2025

View reviewed changes

rodmgwgu force-pushed the feat/librarybackupapi branch from a164a55 to fd78de7 Compare October 3, 2025 21:59

ormsbee requested changes Oct 4, 2025

View reviewed changes

rodmgwgu force-pushed the feat/librarybackupapi branch from d40ac9e to dc941af Compare October 6, 2025 20:51

rodmgwgu requested a review from ormsbee October 6, 2025 21:19

ormsbee requested changes Oct 6, 2025

View reviewed changes

feat: Implementation of library v2 backup endpoints

d56a1b0

rodmgwgu force-pushed the feat/librarybackupapi branch from dc941af to d56a1b0 Compare October 6, 2025 21:36

ormsbee approved these changes Oct 6, 2025

View reviewed changes

ormsbee merged commit 0c9997c into openedx:master Oct 6, 2025
65 checks passed

github-project-automation Bot moved this from In Eng Review to Done in Contributions Oct 6, 2025

rodmgwgu mentioned this pull request Oct 7, 2025

feat: add library restore endpoint #37439

Merged

holaontiveros mentioned this pull request Oct 9, 2025

[feature] add backup view for libraries v2 openedx/frontend-app-authoring#2532

Merged

7 tasks

bradenmacdonald reviewed Oct 17, 2025

View reviewed changes

rodmgwgu mentioned this pull request Oct 20, 2025

fix: Return an absolute url on libraries v2 backup endpoint. #37508

Merged

Conversation

rodmgwgu commented Oct 1, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Use Case

Example Requests

POST Response Values

Example POST Response

GET Parameters

GET Response Values

Example GET Response

Supporting information

Testing instructions

Deadline

Other information

Uh oh!

openedx-webhooks commented Oct 1, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

dwong2708 commented Oct 2, 2025

Uh oh!

Uh oh!

Uh oh!

ormsbee left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ormsbee left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

rodmgwgu commented Oct 1, 2025 •

edited

Loading

openedx-webhooks commented Oct 1, 2025 •

edited

Loading